Hello, and welcome to my talk about a thorny piece of malware. My name is Marian. I'm a
malware analyst. I'm going to spare you my second name because people seem to have
problems pronouncing it anyway. I work for an Austrian security software company.
My talk is about one specifically thorny piece of malware that I analyzed
inside and out. I'm going to start off with some fun facts about that sample, and
the rest of the talk is all about the analysis issues I had when looking into it.
I'm going to present two anti-analysis techniques I encountered in the sample and two more
analyst headaches that still pose problems for reverse engineers. The first technique is
exception handling that can obfuscate the execution path. The second is junk code I
encountered in there, which looked pretty nasty at first glance but was easy to get past after all.
Then I'm going to talk about
binary analysis of C++ executables and about multi-threaded applications in reverse
engineering. All right, let's start. Altogether, this is my favorite piece of malware.
Why would it be my favorite piece of malware? Well, I reversed it inside and out, from top to bottom,
and I really had a lot of fun. It is a challenging piece of malware, but not impossible to get through
even for beginners. It's not packed or encrypted, but it still provides a lot of interesting topics
to research. But what does it do after all? Well, as I summarized here, it's an Asian
multi-threaded, non-polymorphic, file-infecting spy bot. What does it do? It can take screenshots.
It can produce screen captures and send them to the C&C server. It can delete files and copy files.
It can execute files. Most of all, it can update itself, so it can download a new version of itself
and execute that one.
So, basically, it can do anything the malware controller wants it to.
Anyway, what are the interesting facts about it?
This sample uses structured exception handling to obfuscate its execution path.
That means that by deliberately throwing exceptions, the malware author can pass execution control
from one place in the executable to another, namely the exception handler.
And the interesting thing about exception handlers is that inside an exception handler you can
define a new entry point that's going to be executed after the exception handler has finished.
Now, how does this work?
Well, the most accurate documentation I could find was written back in 1997 by a guy called
Matt Pietrek, who is one of my big heroes now because he produced really nice documentation
on this issue, namely "A Crash Course on the Depths of Win32 Structured Exception Handling".
Here comes the summary.
Obviously, it's a good thing.
According to this article, exception handling is implemented as a chain of exception handlers,
which is located on the stack and intertwined with the function stack frames that are on
there. It all starts at the thread information block, because every thread has its own chain
of exception handlers.
A reverse engineer can find this chain through the FS register at offset zero, which points to
an exception registration structure, which in the simplest case looks more or less like this.
This structure contains a pointer to the handler, which could eventually handle the
thrown exception, and a pointer to the previous registration block, which
looks just the same. Eventually, at the end of the chain, there comes a default handler
and, well, a minus one.
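As a rough illustration of the chain just described, here is a toy Python model. The addresses and handler names are made up for the example; only the shape (a previous-record pointer plus a handler pointer, terminated by minus one) comes from the description above.

```python
# Toy model of a 32-bit SEH chain; all addresses and names are hypothetical.
END_OF_CHAIN = 0xFFFFFFFF  # the "minus one" that terminates the list

# Simulated stack memory: record address -> (previous record address, handler name)
records = {
    0x0012FF10: (0x0012FF60, "handler_of_current_function"),
    0x0012FF60: (0x0012FFB0, "handler_of_caller"),
    0x0012FFB0: (END_OF_CHAIN, "default_handler"),
}

def walk_seh_chain(fs0, memory):
    """Follow the previous-record pointers, starting at the record FS:[0] points to."""
    chain = []
    address = fs0
    while address != END_OF_CHAIN:
        previous, handler = memory[address]
        chain.append(handler)
        address = previous
    return chain

chain = walk_seh_chain(0x0012FF10, records)
print(chain)
```

Walking the list this way is also what an analyst does by hand in a debugger: start at FS:[0] and follow the pointers until the terminator shows up.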
All right.
Now, this is all based on the stack, intertwined with the function stack frames, and there's
a whole science about building the stack and unwinding the stack.
But what's really interesting for a malware author is, of course, that you can register your
own exception handlers, deliberately throw exceptions, and that way redirect the
execution flow to some other place in the code.
Now, this looks more or less like this.
If you're inside a binary and spot something like FS:[0] and see the structure
where a new exception handler is linked into that list, that most
likely has to do with exception handling.
Now, I told you there's a pointer in there pointing to the handler code, which would
be the first switch to some other point in the executable, and inside
this handler someone can then change the execution flow to a completely different point
in the executable.
The magic thing about this is that an exception is treated like a software interrupt, which
means that every time an exception occurs, the whole context structure of the thread is
saved away and loaded back into the CPU when the exception handler is finished.
And the interesting thing there is that someone can change this context structure and point
the instruction pointer somewhere completely different.
So yeah, I know there are a lot of people in here who get excited when they hear they can
point the instruction pointer somewhere.
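The redirection through the saved context can be sketched with a small Python model. This is a simplification under stated assumptions: the dictionary stands in for the saved register context, and the addresses, the handler, and the string disposition value are invented for illustration.

```python
# Toy dispatcher: models how a handler can redirect execution by editing the saved context.
ACCESS_VIOLATION = 0xC0000005

def redirecting_handler(exception_code, context):
    # Hypothetical obfuscating handler: it does nothing but set a new entry point.
    context["Eip"] = 0x00401500
    return "continue_execution"

def dispatch_exception(exception_code, faulting_eip, handler):
    # The register state of the faulting thread is saved away ...
    context = {"Eip": faulting_eip, "Eax": 0, "Ecx": 0xDEADBEEF}
    disposition = handler(exception_code, context)
    if disposition != "continue_execution":
        raise RuntimeError("exception not handled")
    # ... and execution resumes from whatever the (possibly modified) context says.
    return context["Eip"]

resumed_at = dispatch_exception(ACCESS_VIOLATION, 0x00401020, redirecting_handler)
print(hex(resumed_at))
```

The point for the analyst is that the address where execution resumes never appears as a jump or call target in the faulting code; it only shows up inside the handler.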
All right.
Now, I told you a lot of things have changed since then, especially concerning C++.
Exception handling in Visual C++, well, is still based on the structured exception handling
I showed you before.
But what has changed mainly is that now every function has its own exception handler
and uses a FuncInfo structure, which contains information about try blocks and catch blocks.
And I think I need to take a break.
You'd be correct.
We have a little tradition.
Let me tell you all about it.
It involves ‑‑ Louder!
Why?
Why are we making her drink?
Do we have any first timers here in the audience?
So really nobody is a first timer?
None?
Wait.
Okay.
Who is everybody pointing at?
All right.
Get up here.
I can't believe this is the only guy.
That's amazing.
Cheers.
Welcome to DEF CON.
Have a good time.
Thank you.
Cheers.
Okay. Now where was I? All right, Visual C++ structured exception handling. It's still
based on the principle of structured exception handling
I showed you before, but now every function has its own dedicated exception handler, which
is compiler-generated and uses a structure called FuncInfo that contains a
lot of information, namely information for unwinding funclets, about try blocks and catch
blocks, and, well, a lot of pointers to the exception handlers that eventually have to
handle exceptions. Right. There's a built-in function called __CxxFrameHandler, which this
FuncInfo structure is handed over to and which then performs, well, the magic around exception
handling to execute the exception handlers. And still, as I mentioned, the important
thing there is that the exception handler can define a new entry point. Now, I want to point to a really
nice diagram that we see in here. Interesting, right? It is from OpenRCE, painted
by Igor Skochinsky.
You got it all, right?
Well, I've provided some screenshots here from IDA Pro.
Let's get back to the bot.
In practice, this would look like this.
For example, there's a registration sequence.
I hope people can read this.
Maybe.
I don't know.
Well, there's the FS:[0] flying by, and a new exception handler is registered at the beginning of
a function.
Some time later there's an exception happening.
If you can read that call, it will almost never work, right?
Because the memory address that is put into ECX there is rather unlikely to be valid.
So there's the exception and the registered exception handler, which causes the system
to execute the compiler-generated handler.
Then the FuncInfo structure is handed over to __CxxFrameHandler, which then performs
the magic.
Let's look into this FuncInfo structure.
In this FuncInfo structure there are the values that Igor thankfully pointed out in his diagram.
So there's the try block map and the handler array and finally the pointer to the handler
that the user registered.
So there's the user-generated handler.
And in there you can find the offset of the new entry point.
If you have a look at the user-generated handler, it is really obvious that this handler is
registered just for obfuscation, because nothing else happens in there than the setting
of this offset for the new entry point.
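The path from the function's exception metadata to the catch handler can be sketched as a small parser. This is only a sketch: the field order follows the publicly documented, reverse-engineered 32-bit Visual C++ layout (the structure usually called FuncInfo, with its try block map and handler array), which is an assumption about the sample's compiler version, and the memory image and addresses are synthetic.

```python
import struct

def read(image, base, address, fmt):
    """Unpack little-endian fields at a virtual address of a flat memory image."""
    return struct.unpack_from("<" + fmt, image, address - base)

def catch_handlers(image, base, func_info_address):
    """Return the addresses of all catch handlers reachable from one FuncInfo."""
    # FuncInfo: magicNumber, maxState, pUnwindMap, nTryBlocks, pTryBlockMap,
    #           nIPMapEntries, pIPtoStateMap
    magic, _max_state, _unwind, n_try, try_map, _n_ip, _ip_map = read(
        image, base, func_info_address, "IiIIIII")
    if (magic & 0xFFFFFFF0) != 0x19930520:
        raise ValueError("not a FuncInfo structure")
    handlers = []
    for i in range(n_try):
        # TryBlockMapEntry (20 bytes): tryLow, tryHigh, catchHigh, nCatches, pHandlerArray
        *_states, n_catches, handler_array = read(image, base, try_map + 20 * i, "iiiiI")
        for j in range(n_catches):
            # HandlerType (16 bytes): adjectives, pType, dispCatchObj, addressOfHandler
            *_rest, handler = read(image, base, handler_array + 16 * j, "IIiI")
            handlers.append(handler)
    return handlers

# Synthetic image: one try block with a single catch handler at a made-up address.
BASE = 0x00403000
image = bytearray(0x60)
struct.pack_into("<IiIIIII", image, 0x00, 0x19930520, 1, 0, 1, BASE + 0x20, 0, 0)
struct.pack_into("<iiiiI", image, 0x20, 0, 0, 1, 1, BASE + 0x40)
struct.pack_into("<IIiI", image, 0x40, 0, 0, 0, 0x00401A30)

found = catch_handlers(bytes(image), BASE, BASE)
print([hex(h) for h in found])
```

In a real analysis the same walk is done in the disassembler: from the compiler-generated handler to the structure it passes on, then to the try block map, the handler array, and finally the handler code itself.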
All right.
So much for exceptions.
The second point is the junk code in the file.
There was really quite a lot of junk code to find in that sample, which is pretty scary
for young analysts when you see a lot of XORs and a lot of shift operations and a lot
of loops in there that actually don't compute anything useful.
So I was kind of overwhelmed by this junk code until I found the principle behind the junk
code in my sample.
There's a whole lot of research about junk code in binary files, and the principle of
this junk code was pretty simple.
It was opaque predicates.
Now, an opaque predicate is a branch condition that always
evaluates to true or always evaluates to false, so only one of the branches
is ever going to be executed, and the other branch gets the junk code.
So, well, in the sample I analyzed, it looked somewhat like this.
On the right side, you see the screenshots.
On the left side, there's the simplified version.
And if you think it through, the compare statement at the end is never going to set
the zero flag.
So the jump-if-not-zero is always going to take the green branch.
Right.
You think now this is simple?
It's true.
It's like this all throughout the sample.
It was just as simple.
So what did the analyst do?
I just put ignore mode on and always followed the green branch.
If you can see these graphics, I'm not sure how they come across, the yellow boxes are
the productive code and the white boxes are just junk code.
So this was really pretty simple to get past.
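To make the idea of an opaque predicate concrete, here is a minimal Python sketch. The predicate is an invented example, not the one from the sample: a square is always 0 or 1 modulo 4, so the comparison against 2 can never succeed and the "not equal" branch is always taken.

```python
def opaque_predicate(x):
    """Always True: a square is 0 or 1 modulo 4, so it can never equal 2."""
    return ((x * x) & 3) != 2

def obfuscated_step(x):
    if opaque_predicate(x):          # like a JNZ that is always taken
        return x + 1                 # productive code
    # Junk code: never reached, only there to confuse the analyst.
    for _ in range(8):
        x = ((x << 3) ^ 0x5A5A) & 0xFFFF
    return x

always_taken = all(opaque_predicate(x) for x in range(-1000, 1000))
print(always_taken, obfuscated_step(41))
```

Once the analyst has convinced herself that the condition is constant, the junk branch can be ignored everywhere the same pattern appears.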
Analyst headaches.
I spent a lot of time in that sample and had a lot of headaches, especially because
of the threads in there.
People who have seen the movie
Drag Me to Hell know what I've been through with that application.
The author of the sample actually has all my respect because he produced this in C++.
This is a simplified version of the threads that I found in there.
There are actually a lot more, but it basically boils down to one thread that manages
the bot instances that could
start up in the system.
Because eventually there's more than one file:
every infected file could start up an instance.
As a second thread, there was the file infector, always infecting processes that would start
up.
Then there was thread machinery that would handle the sending side of the bot, which could send
messages and data to the C&C.
And one part that was the receiving side of the bot.
And of course the C&C command switching.
Now how did I get to that information?
That was pretty tricky, and I spent a lot of trial-and-error time in there.
What I did came down to four steps. First, I realized that I had to spot the really interesting threads,
because there's a lot of timing overhead and synchronization going on.
After doing this, I had to spot the inter-thread communication and the synchronization methods,
which actually told me a lot about what the threads were about and which specific events triggered them.
I will talk a little bit more about this pretty soon.
As the third step, of course, I had to analyze
the function of the threads to some extent to really find out what they do, what information they generate,
and where this information would eventually flow to.
Knowing all of this, in the fourth step I could put together the big picture of where
information is generated and which thread accepts this information,
processes it, and eventually takes action.
All right.
So if you go back to that diagram, I found four different methods of synchronization
in there.
There were events, used as
triggers for the file infector and for managing the different instances that were started.
Thread messages, which were mainly used at the sending side of the bot.
An I/O completion port, which was used to manage the receiving side of the
bot.
And critical sections for data exchange between the threads.
When I had that, I could paint the threads around the synchronization methods.
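The way these primitives tie threads together can be sketched with Python's standard library. This is only an analogy under stated assumptions: `threading.Event` stands in for a Win32 event, a `queue.Queue` for thread messages, and a `threading.Lock` for a critical section; there is no I/O completion port here, and the thread names are illustrative placeholders that do nothing harmful.

```python
import queue
import threading

trigger_event = threading.Event()   # stands in for a Win32 event
send_messages = queue.Queue()       # stands in for a thread message queue
state_lock = threading.Lock()       # stands in for a critical section
shared_log = []                     # data exchanged between the threads

def record(entry):
    with state_lock:                # enter and leave the "critical section"
        shared_log.append(entry)

def triggered_worker():
    trigger_event.wait()            # sleeps until another thread signals the event
    record("worker woke up")
    send_messages.put("report")     # hand data over to the sender thread
    send_messages.put(None)         # sentinel: no more messages

def sender():
    while True:
        message = send_messages.get()   # blocks until a message is posted
        if message is None:
            break
        record("sent " + message)

threads = [threading.Thread(target=triggered_worker), threading.Thread(target=sender)]
for t in threads:
    t.start()
trigger_event.set()                 # the managing thread triggers the worker
for t in threads:
    t.join()
print(shared_log)
```

Reading a binary the other way around works the same: find who waits on an object and who signals it, and the data flow between the threads falls out of that.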
All right.
Now here comes the last nastiness for today.
C++.
The reversing of C++ is actually a whole science of its own.
For people who are interested in that, I collected a lot of links to that research
on the last slide of this presentation.
But what I actually want to talk about are virtual function calls.
Virtual function calls are really interesting to reverse because they're indirect calls
and they're only fully determinable at runtime.
They stem from the inheritance and polymorphism features of C++.
All right.
So one of these virtual function calls can actually call into several different methods
at runtime.
They're translated using virtual function tables, which also help a lot in reversing
these sorts of binaries.
I provided an example here.
In this example, the address of a virtual function table is loaded into the register EAX.
And at offset 4 of this virtual function table, there's a method that's going to be called
with this call statement.
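What such an indirect call does can be modeled in a few lines of Python. The method names and the object layout are hypothetical; the sketch assumes 32-bit code, so each table slot is 4 bytes wide and offset 4 selects the second entry.

```python
POINTER_SIZE = 4  # 32-bit code: slot n lives at byte offset 4 * n

def first_method(this):
    return "first_method"

def second_method(this):
    return "second_method"

# The virtual function table: an array of function pointers.
vtable = [first_method, second_method]

# The object: its first field is the pointer to the virtual function table.
obj = {"vptr": vtable, "field": 123}

def virtual_call(this, byte_offset):
    """Model of: mov eax, [ecx] / call dword ptr [eax + byte_offset]."""
    table = this["vptr"]                          # load the table address
    target = table[byte_offset // POINTER_SIZE]   # pick the slot at that offset
    return target(this)                           # indirect call, 'this' passed along

print(virtual_call(obj, 4))
```

The disassembly only shows the offset; which function sits in that slot depends on the table the object carries at runtime.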
This was really sort of a catch-me-if-you-can.
Actually, I collected another example from OpenRCE and Igor Skochinsky, because he did a lot of
research on this as well.
Here's one class A, which has two virtual functions defined in it.
Underneath this class definition, you can see the memory layout of class A, where there's
a virtual function pointer pointing to the virtual function table of class A.
Now, a virtual function table is something that only classes with virtual functions
have.
All right.
And here's the second class B, which has a really similar layout, also with two virtual
functions defined in it.
And another interesting thing is class C, because class C inherits from both class
A and class B and implements one virtual function of each.
Now, the memory layout of class C is somewhat bigger because, as it inherits from other classes,
it has to include their class layouts and also the virtual function pointers in there to
the virtual function tables.
It's just that these virtual function tables are now adapted to fit the needs of class C and
point to the actual function offsets of the functions that class C implements.
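The adapted tables can be illustrated with a small Python model. The member and function names below are invented for the example and the offsets assume 32-bit pointers and 4-byte integers; it also glosses over details such as the adjustor thunks real compilers may place in the second table.

```python
def build_vtable(base_slots, overrides, owner):
    """Copy a base class table and patch in the slots the derived class overrides."""
    return [(owner if name in overrides else base) + "::" + name
            for base, name in base_slots]

A_SLOTS = [("A", "a_virt1"), ("A", "a_virt2")]
B_SLOTS = [("B", "b_virt1"), ("B", "b_virt2")]

# Hypothetical class C : public A, public B, overriding one function of each base.
C_OVERRIDES = {"a_virt2", "b_virt2"}
vtable_c_for_a = build_vtable(A_SLOTS, C_OVERRIDES, "C")
vtable_c_for_b = build_vtable(B_SLOTS, C_OVERRIDES, "C")

# Memory layout of a C object: (byte offset, content).
layout_c = [
    (0,  "vptr -> C's table for the A part"),
    (4,  "A::a1"),
    (8,  "vptr -> C's table for the B part"),
    (12, "B::b1"),
    (16, "C::c1"),
]

print(vtable_c_for_a)
print(vtable_c_for_b)
```

So an object of the derived class carries one table pointer per base class, and each table mixes inherited entries with the overridden ones.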
All right.
This is really dry code to look at.
Now back to business.
Here's the C&C command switching function, which is a really good example of virtual
function calls.
On there, you see a lot of yellow boxes.
This is all memory allocation for objects that are going to be instantiated in the
green boxes.
And then you see this one pink box, which is the virtual function call that was actually
used to call into the bot functions.
The bot functions are implemented as classes derived from one bot action superclass,
and they all had one function overridden, which was the bot action.
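The design described here, one base class with a single overridden method and a switch that instantiates the matching derived class, might look roughly like this in Python. Class names, command IDs, and return strings are all hypothetical, and the actions are inert placeholders that only return text.

```python
class BotAction:
    """Hypothetical base class; every command overrides action()."""
    def action(self, argument):
        raise NotImplementedError

class MoveFile(BotAction):
    def action(self, argument):
        return "would move " + argument        # inert placeholder

class TakeScreenshot(BotAction):
    def action(self, argument):
        return "would capture the screen"      # inert placeholder

COMMANDS = {1: MoveFile, 2: TakeScreenshot}    # made-up command IDs

def switch_command(command_id, argument=""):
    bot_action = COMMANDS[command_id]()        # allocation plus constructor call
    return bot_action.action(argument)         # the single virtual call site

print(switch_command(1, "a.txt"))
```

This is why the switching function shows many allocations and constructor calls but only one indirect call: every command funnels through the same overridden method.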
Now I'll provide you another example with the move file object.
Here in yellow, you see the memory allocation, where
space is reserved for the object that's going to be instantiated a little bit later in
the green box.
And what you see there is a call to a constructor.
Now, this constructor actually has a call into the superclass constructor, as it works with
derived objects.
And there you see the function call.
First we have a virtual function table flying by.
I will talk about this in a second.
As I mentioned, this constructor has a call into the base class constructor.
And there you see another virtual function table, where there's space reserved for two
virtual functions.
Now, in IDA Pro, you check the cross references of this base class constructor.
There are 23 cross references.
And now guess what?
Surprise, there are 23 bot actions that can be taken by the bot.
All right.
Now knowing this.
Let's see.
Okay.
So the final step is the call into the function method of the move
file object.
And what you see there is that the virtual function table of the move file object is loaded into
the register,
and the function at offset 4 is called.
Now if you have a look at the virtual function table of the move file object, at offset 4
there is the move file function.
So, theory proven.
Using these virtual function tables, you can, not easily but pretty fast, determine
which functions are going to be called at these virtual function calls.
All right.
This was my presentation.
Here are the promised links.
The sample can be found online under the first link.
And, well, if there are any questions, you can contact me on Twitter, or I'm going to be out
in the hallway to answer your questions or receive criticism or anything you want to tell
me.
Thank you.
